AITopics | local minimizer

We study the minimization of non-convex functionals over the Wasserstein space. While recent work has showed that perturbed Wasserstein gradient methods can avoid saddle points for benign landscapes, existing approaches remain essentially first-order and do not provide fast local convergence once the iterates enter a neighborhood of a global minimizer. We propose Wasserstein Saddle-Free Newton (WSFN), a second-order method that preconditions the Wasserstein gradient by a regularized square root of the squared Wasserstein Hessian. This construction preserves attraction toward directions of positive curvature while inducing repulsion along directions of negative curvature, thereby overcoming the tendency of standard Wasserstein Newton dynamics to be attracted to saddles. We also establish second-order sufficient optimality conditions on Wasserstein space for strict local minimality. Under regularity and benign landscape assumptions, we prove that WSFN escapes saddle regions and reaches an $α$-neighborhood of a global minimizer in polynomial time, with improved dependence on saddle parameters compared with prior perturbed first-order methods. Once inside this neighborhood, we show that WSFN converges linearly in $L^2$-Wasserstein distance to a non-degenerate global minimizer. Finally, we present a particle-based implementation of the method.

artificial intelligence, machine learning, minimizer, (17 more...)

arXiv.org Machine Learning

2605.17963

Country:

Asia > Japan (0.28)
North America > United States (0.27)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

WiVi Y

Neural Information Processing SystemsApr-24-2026, 23:33:07 GMT

See Appendix A.9 for the derivation of the Hessian. We have proved the statement (a). Notice that this is a contradiction because any point with WL+1 = 0 is in the set S. Hence, there exists no point at which the Hessian is negative semidefinite. Because the negative semidefiniteness is a necessary condition for a local maximum, every critical point is then either a local minimum or a saddle point. We have proved the statement (b).

artificial intelligence, machine learning, optimization problem, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

1bf50aaf147b3b0ddd26a820d2ed394d-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 23:33:04 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

9b8b50fb590c590ffbf1295ce92258dc-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 12:55:30 GMT

eigenvalue, equation, simulation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada (0.04)
Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

5d97f4dd7c44b2905c799db681b80ce0-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 14:15:24 GMT

Convex optimization problems with staged structure appear in several contexts, including optimal control, verification of deep neural networks, and isotonic regression. Off-the-shelf solvers can solve these problems but may scale poorly.

artificial intelligence, machine learning, minimizer, (20 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

5d97f4dd7c44b2905c799db681b80ce0-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 14:15:17 GMT

algorithm, local minimizer, minimizer, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

349a45f211fb1b3850da1ccd829e869e-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 06:54:45 GMT

critical point, morse-bott function, optimization, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

1bf50aaf147b3b0ddd26a820d2ed394d-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 17:26:40 GMT

representation, residual block, resnest, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Riemannian stochastic optimization methods avoid strict saddle points

Neural Information Processing SystemsDec-25-2025, 13:46:14 GMT

Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, typically solved with a Riemannian stochastic gradient method (or some variant thereof). However, in many cases of interest, the resulting minimization problem is _not_ geodesically convex, so the convergence of the chosen solver to a desirable solution - i.e., a local minimizer - is by no means guaranteed. In this paper, we study precisely this question, that is, whether stochastic Riemannian optimization algorithms are guaranteed to avoid saddle points with probability $1$. For generality, we study a family of retraction-based methods which, in addition to having a potentially much lower per-iteration cost relative to Riemannian gradient descent, include other widely used algorithms, such as natural policy gradient methods and mirror descent in ordinary convex spaces. In this general setting, we show that, under mild assumptions for the ambient manifold and the oracle providing gradient information, the policies under study avoid strict saddle points / submanifolds with probability $1$, from any initial condition. This result provides an important sanity check for the use of gradient methods on manifolds as it shows that, almost always, the end state of a stochastic Riemannian algorithm can only be a local minimizer.

gradient method, name change, strict saddle point, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)

Add feedback

Structured Local Minima in Sparse Blind Deconvolution

Yuqian Zhang, Han-wen Kuo, John Wright

Neural Information Processing SystemsNov-20-2025, 14:56:39 GMT

This paper focuses on the short and sparse blind deconvolu-tion problem, where the one unknown signal is short and the other one is sparsely and randomly supported.

artificial intelligence, deconvolution, optimization problem, (16 more...)

Neural Information Processing Systems

Country: